Hands-on Machine Learning with R
Bookclub R-Ladies Utrecht and R-Ladies Den Bosch
log/Box-Cox
Approximate (linear) relationship between a continuous response variable and a set of predictor variables
Libraries
library(dplyr) # for data manipulation
library(ggplot2) # for graphics
library(caret) # for cross-validation, etc.
library(rsample) # for initial_split(); loaded in an earlier chapter of the book
library(vip) # variable importance
#library(pdp) # is used in the section on variable importance
If Y and X are (approximately) linearly related, then: \(Y_i = \beta_0 + \beta_1X_i + \epsilon_i \text{, for } i = 1, ..., n, \text{ and } \epsilon_i \sim N(0,\sigma^2)\)
\(\beta_0\): intercept, average response when X = 0
\(\beta_1\): slope, increase in average response per 1 unit increase in X
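To make the meaning of the intercept and slope concrete, here is a minimal sketch on simulated data (not the Ames data). The true values 2 and 3 and the names `x`, `y`, `sim_fit` and `pred` are made up for illustration.

```r
# Simulate data from Y = beta_0 + beta_1 * X + epsilon with known coefficients
set.seed(123)
n <- 200
x <- runif(n, min = 0, max = 10)
y <- 2 + 3 * x + rnorm(n, mean = 0, sd = 1)   # beta_0 = 2, beta_1 = 3, sigma = 1

# Fit by ordinary least squares
sim_fit <- lm(y ~ x)
coef(sim_fit)   # estimates should be close to 2 (intercept) and 3 (slope)

# Slope interpretation: the predicted mean response changes by beta_1
# for every 1 unit increase in x
pred <- predict(sim_fit, newdata = data.frame(x = c(4, 5)))
diff(pred)      # equals the estimated slope
```

The difference between the two predictions is exactly the fitted slope, which is the "increase in average response per 1 unit increase in X".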
Call:
lm(formula = Sale_Price ~ Gr_Liv_Area, data = ames_train)
Residuals:
Min 1Q Median 3Q Max
-474682 -30794 -1678 23353 328183
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 15938.173 3851.853 4.138 3.65e-05 ***
Gr_Liv_Area 109.667 2.421 45.303 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 56790 on 2047 degrees of freedom
Multiple R-squared: 0.5007, Adjusted R-squared: 0.5004
F-statistic: 2052 on 1 and 2047 DF, p-value: < 2.2e-16
model2 <- lm(Sale_Price ~ Gr_Liv_Area + Year_Built, data = ames_train)
# an interaction between two predictors is specified with `:`
model2a <- lm(Sale_Price ~ Gr_Liv_Area + Year_Built + Gr_Liv_Area:Year_Built, data = ames_train)
model3 <- lm(Sale_Price ~ ., data = ames_train)
For this example, the "best" model is the one with the lowest RMSE via cross-validation
set.seed(123) # for reproducibility
(cv_model1 <- train(
form = Sale_Price ~ Gr_Liv_Area,
data = ames_train,
method = "lm",
trControl = trainControl(method = "cv", number = 10)
))
Linear Regression
2049 samples
1 predictor
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 1843, 1844, 1844, 1844, 1844, 1844, ...
Resampling results:
RMSE Rsquared MAE
56644.76 0.510273 38851.99
Tuning parameter 'intercept' was held constant at a value of TRUE
The (averaged) RMSE for the 3 main effect models:
[1] 56644.76
[1] 46865.68
[1] 41691.74
Interpret the CV result as:
When applied to unseen data, the predictions of model 3 are typically about 41691.74 (the cross-validated RMSE, in dollars) off from the actual sale price.
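How an averaged cross-validated RMSE comes about can be sketched in base R. This is only an illustration on the built-in `mtcars` data with a hypothetical model `mpg ~ wt`; it is not the code caret runs internally, and the fold assignment will differ from caret's.

```r
# Manual k-fold cross-validation of a linear model (built-in mtcars data)
set.seed(123)
k <- 10
n <- nrow(mtcars)
fold_id <- sample(rep(seq_len(k), length.out = n))   # random fold assignment

rmse_per_fold <- sapply(seq_len(k), function(i) {
  train_set <- mtcars[fold_id != i, ]
  test_set  <- mtcars[fold_id == i, ]
  fit  <- lm(mpg ~ wt, data = train_set)     # fit on the other k - 1 folds
  pred <- predict(fit, newdata = test_set)   # predict the held-out fold
  sqrt(mean((test_set$mpg - pred)^2))        # RMSE on the held-out fold
})

cv_rmse <- mean(rmse_per_fold)   # RMSE averaged over the k folds
cv_rmse
```

Each fold is held out once, so every RMSE is computed on data the model did not see during fitting.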
Be sure the assumptions hold:
Address multicollinearity for instance by using Principal Components as predictors
set.seed(123)
cv_model_pcr <- train(
Sale_Price ~ .,
data = ames_train,
method = "pcr",
trControl = trainControl(method = "cv",
number = 10),
preProcess = c("zv", "center", "scale"),
tuneLength = 100)
bestTune <- cv_model_pcr$bestTune[1,1]
ggplot(cv_model_pcr) +
geom_vline(xintercept = bestTune,
color = "red")
Questions I have:
- Why is this useful? It brings the RMSE down, but do we get insight into the importance of predictors? Is that different from regular regression?
- I thought there are at most ncol(data) PCs
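What principal component regression does can be sketched by hand in base R. The example below uses the built-in `mtcars` and `iris` data and an arbitrary choice of two components; all object names are hypothetical. The last lines hint at one possible answer to the second question: with factor predictors, dummy coding produces more predictor columns than there are variables in the data, so more components are available than `ncol(data)` suggests.

```r
# Principal component regression by hand (built-in mtcars data, not Ames)
X <- mtcars[, c("cyl", "disp", "hp", "wt")]    # strongly correlated predictors

# Unsupervised step: components are built from the predictors only
pca <- prcomp(X, center = TRUE, scale. = TRUE)
n_comp <- 2                                     # arbitrary here; caret tunes it by CV
pcr_data <- data.frame(mpg = mtcars$mpg, pca$x[, seq_len(n_comp)])

# Regress the response on the leading components instead of the raw predictors
pcr_fit <- lm(mpg ~ PC1 + PC2, data = pcr_data)
coef(pcr_fit)

# Component scores are uncorrelated, which is how PCR sidesteps multicollinearity
pc_cor <- cor(pcr_data$PC1, pcr_data$PC2)

# Dummy coding: 2 variables (one numeric, one 3-level factor) give 3 predictor columns
mm <- model.matrix(~ ., data = iris[, c("Sepal.Length", "Species")])
n_predictor_cols <- ncol(mm) - 1                # drop the intercept column
n_predictor_cols
```

Because the coefficients belong to components rather than to the original predictors, they are harder to read than ordinary regression coefficients.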
Supervised dimension reduction procedure:
- finds new features
- that not only capture most of the information in the original features,
- but are also related to the response
- PLS places the highest weight on the variables most strongly related to the response
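The weighting idea in the list above can be sketched for the first PLS component only. This is a hand-rolled illustration on the built-in `mtcars` data, not the algorithm as implemented in the `pls` package. It assumes a single response and standardized predictors, and the names `w`, `t1` and `r` are made up.

```r
# First PLS component by hand (built-in mtcars data)
X <- scale(as.matrix(mtcars[, c("cyl", "disp", "hp", "wt", "qsec")]))  # center and scale
y <- mtcars$mpg - mean(mtcars$mpg)                                     # center the response

# Supervised step: weights are proportional to the covariance of each predictor with y
w <- drop(crossprod(X, y))
w <- w / sqrt(sum(w^2))            # normalise to unit length
round(w, 3)

# The new feature is a weighted sum of the original predictors
t1 <- drop(X %*% w)

# The predictor most strongly correlated with y gets the largest absolute weight
r <- drop(cor(X, y))
names(which.max(abs(w))) == names(which.max(abs(r)))
```

With standardized predictors the weights are proportional to the correlations with the response, which is the sense in which PLS is "supervised" while PCA is not.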
set.seed(123)
cv_model_pls <- train(
Sale_Price ~ .,
data = ames_train,
method = "pls",
trControl = trainControl(method = "cv",
number = 10),
preProcess = c("zv", "center", "scale"),
tuneLength = 30
)
bestTune <- cv_model_pls$bestTune[1,1]
ggplot(cv_model_pls) +
geom_vline(xintercept = bestTune,
color = "red")
, , 1 comps
.outcome
MS_SubClassOne_Story_1945_and_Older -1148.4437622
MS_SubClassOne_Story_with_Finished_Attic_All_Ages -147.5518230
MS_SubClassOne_and_Half_Story_Unfinished_All_Ages -337.5525120
MS_SubClassOne_and_Half_Story_Finished_All_Ages -865.0621444
MS_SubClassTwo_Story_1946_and_Newer 1790.3091140
MS_SubClassTwo_Story_1945_and_Older -294.2140110
MS_SubClassTwo_and_Half_Story_All_Ages 142.9321733
MS_SubClassSplit_or_Multilevel -116.6483406
MS_SubClassSplit_Foyer -233.9429173
MS_SubClassDuplex_All_Styles_and_Ages -499.7477346
MS_SubClassOne_Story_PUD_1946_and_Newer 474.0964976
MS_SubClassTwo_Story_PUD_1946_and_Newer -529.4888895
MS_SubClassPUD_Multilevel_Split_Level_Foyer -339.6653643
MS_SubClassTwo_Family_conversion_All_Styles_and_Ages -463.2790399
MS_ZoningResidential_High_Density -251.1720053
MS_ZoningResidential_Low_Density 1251.4545722
MS_ZoningResidential_Medium_Density -1435.1590405
MS_ZoningA_agr -222.2313362
MS_ZoningC_all -588.5308147
MS_ZoningI_all -188.4592485
Lot_Frontage 946.5746179
Lot_Area 1268.8685184
StreetPave 406.0874043
AlleyNo_Alley_Access 541.3907149
AlleyPaved 7.2636545
Lot_ShapeSlightly_Irregular 1286.1324281
Lot_ShapeModerately_Irregular 604.3108370
Lot_ShapeIrregular 56.7309504
Land_ContourHLS 996.4744503
Land_ContourLow 40.4632900
Land_ContourLvl -313.1923148
UtilitiesNoSeWa -57.5003575
UtilitiesNoSewr -177.2202607
Lot_ConfigCulDSac 738.5826335
Lot_ConfigFR2 -69.9826939
Lot_ConfigFR3 -61.4271755
Lot_ConfigInside -313.3624865
Land_SlopeMod 260.4807670
Land_SlopeSev 117.4113000
NeighborhoodCollege_Creek 385.2450000
NeighborhoodOld_Town -1007.7948023
NeighborhoodEdwards -745.5067238
NeighborhoodSomerset 759.7798061
NeighborhoodNorthridge_Heights 2063.7545226
NeighborhoodGilbert 103.9408237
NeighborhoodSawyer -593.6664520
NeighborhoodNorthwest_Ames 88.4540426
NeighborhoodSawyer_West 35.1577458
NeighborhoodMitchell -209.0198068
NeighborhoodBrookside -660.1834000
NeighborhoodCrawford 324.2803697
NeighborhoodIowa_DOT_and_Rail_Road -930.5129973
NeighborhoodTimberland 661.5970764
NeighborhoodNorthridge 1462.4573725
NeighborhoodStone_Brook 1216.4405229
NeighborhoodSouth_and_West_of_Iowa_State_University -348.4826044
NeighborhoodClear_Creek 184.8274460
NeighborhoodMeadow_Village -522.6223815
NeighborhoodBriardale -477.5235866
NeighborhoodBloomington_Heights 77.0457274
NeighborhoodVeenker 267.6426497
NeighborhoodNorthpark_Villa -219.1978942
NeighborhoodBlueste -116.7666952
NeighborhoodGreens 36.1280979
NeighborhoodGreen_Hills 185.5883156
NeighborhoodLandmark -58.1624595
Condition_1Feedr -520.1024234
Condition_1Norm 501.4098425
Condition_1PosA 468.1369150
Condition_1PosN 297.6228330
Condition_1RRAe -247.5469100
Condition_1RRAn -31.9018760
Condition_1RRNe -140.0522136
Condition_1RRNn -16.9337473
Condition_2Feedr -246.2331177
Condition_2Norm -64.6223929
Condition_2PosA 701.2227337
Condition_2PosN 146.9542949
Condition_2RRAe 12.0203530
Condition_2RRNn -157.6691047
Bldg_TypeTwoFmCon -473.8892388
Bldg_TypeDuplex -499.7477346
Bldg_TypeTwnhs -472.2808302
Bldg_TypeTwnhsE 267.4836556
House_StyleOne_and_Half_Unf -375.2427958
House_StyleOne_Story -227.7763357
House_StyleSFoyer -371.7836497
House_StyleSLvl -153.8870512
House_StyleTwo_and_Half_Fin 108.2338188
House_StyleTwo_and_Half_Unf 19.8667937
House_StyleTwo_Story 1086.4492939
Overall_QualPoor -515.9361022
Overall_QualFair -746.2640662
Overall_QualBelow_Average -1244.8539342
Overall_QualAverage -1769.0582498
Overall_QualAbove_Average -663.0113467
Overall_QualGood 757.5514070
Overall_QualVery_Good 1981.3986060
Overall_QualExcellent 2169.6935109
Overall_QualVery_Excellent 1675.3822962
Overall_CondPoor -214.6983234
Overall_CondFair -713.1128877
Overall_CondBelow_Average -741.1560633
Overall_CondAverage 1737.1924359
Overall_CondAbove_Average -827.8599673
Overall_CondGood -652.7074623
Overall_CondVery_Good -384.1605700
Overall_CondExcellent 203.1139955
Year_Built 2695.1687566
Year_Remod_Add 2590.1583129
Roof_StyleGable -1221.2730706
Roof_StyleGambrel -233.1829012
Roof_StyleHip 1332.4572710
Roof_StyleMansard -33.8872793
Roof_StyleShed 4.7670700
Roof_MatlCompShg -309.7073488
Roof_MatlMembran 80.2168595
Roof_MatlMetal -1.2216871
Roof_MatlRoll -58.1624595
`Roof_MatlTar&Grv` -41.6040028
Roof_MatlWdShake 177.1746492
Roof_MatlWdShngl 624.1316792
Exterior_1stAsphShn -168.0885830
Exterior_1stBrkComm -88.5815376
Exterior_1stBrkFace 138.5922305
Exterior_1stCemntBd 688.1153806
Exterior_1stHdBoard -499.2389760
Exterior_1stImStucc 107.3630416
Exterior_1stMetalSd -613.2598674
Exterior_1stPlywood -238.8880993
Exterior_1stStone 145.3152758
Exterior_1stStucco -167.0741211
Exterior_1stVinylSd 1568.7952257
`Exterior_1stWd Sdng` -900.2362128
Exterior_1stWdShing -291.4766324
Exterior_2ndAsphShn -141.8074904
`Exterior_2ndBrk Cmn` -224.9339687
Exterior_2ndBrkFace 77.6712394
Exterior_2ndCBlock -134.9662920
Exterior_2ndCmentBd 667.4204831
Exterior_2ndHdBoard -449.7773259
Exterior_2ndImStucc 289.7950459
Exterior_2ndMetalSd -555.8733663
Exterior_2ndPlywood -359.2400775
Exterior_2ndStone -115.2956981
Exterior_2ndStucco -188.2596978
Exterior_2ndVinylSd 1558.7438005
`Exterior_2ndWd Sdng` -813.6823158
`Exterior_2ndWd Shng` -343.3368950
Mas_Vnr_TypeBrkFace 1324.0797599
Mas_Vnr_TypeCBlock -133.6420880
Mas_Vnr_TypeNone -2008.6375162
Mas_Vnr_TypeStone 1418.7258844
Mas_Vnr_Area 2506.6417188
Exter_QualFair -675.6420742
Exter_QualGood 2211.6485539
Exter_QualTypical -2827.8137420
Exter_CondFair -678.8625683
Exter_CondGood -275.8450857
Exter_CondPoor -222.2313362
Exter_CondTypical 526.8548066
FoundationCBlock -1708.9537892
FoundationPConc 2515.6398926
FoundationSlab -579.5911461
FoundationStone -161.4926451
FoundationWood -23.9167970
Bsmt_QualFair -757.1793721
Bsmt_QualGood 1086.4828924
Bsmt_QualNo_Basement -741.6552906
Bsmt_QualPoor -121.7242519
Bsmt_QualTypical -2182.3400136
Bsmt_CondFair -774.2048865
Bsmt_CondGood 436.9811493
Bsmt_CondNo_Basement -741.6552906
Bsmt_CondPoor -168.5258661
Bsmt_CondTypical 591.6054239
Bsmt_ExposureGd 1665.4995527
Bsmt_ExposureMn 120.8918187
Bsmt_ExposureNo -1308.1346959
Bsmt_ExposureNo_Basement -719.9244922
BsmtFin_Type_1BLQ -610.7652891
BsmtFin_Type_1GLQ 2131.0630668
BsmtFin_Type_1LwQ -419.7614684
BsmtFin_Type_1No_Basement -741.6552906
BsmtFin_Type_1Rec -780.1299281
BsmtFin_Type_1Unf -463.8420503
BsmtFin_SF_1 -629.2909141
BsmtFin_Type_2BLQ -137.5110203
BsmtFin_Type_2GLQ 184.3727280
BsmtFin_Type_2LwQ -179.7471593
BsmtFin_Type_2No_Basement -741.6552906
BsmtFin_Type_2Rec -223.7964055
BsmtFin_Type_2Unf 508.1943062
BsmtFin_SF_2 57.1029570
Bsmt_Unf_SF 894.3406865
Total_Bsmt_SF 2959.0305928
HeatingGasA 423.6047097
HeatingGasW -95.7462524
HeatingGrav -362.5116813
HeatingOthW -78.0255196
HeatingWall -337.0108942
Heating_QCFair -687.9197855
Heating_QCGood -641.7433800
Heating_QCPoor -256.7533068
Heating_QCTypical -1584.0900659
Central_AirY 1292.9713618
ElectricalFuseF -649.2898911
ElectricalFuseP -283.6421503
ElectricalMix -150.8567401
ElectricalSBrkr 1186.8736015
ElectricalUnknown -17.7742372
First_Flr_SF 2918.4037670
Second_Flr_SF 1420.8988741
Low_Qual_Fin_SF -180.3170043
Gr_Liv_Area 3405.8746809
Bsmt_Full_Bath 1273.3848143
Bsmt_Half_Bath -120.0428108
Full_Bath 2683.0864418
Half_Bath 1418.1869064
Bedroom_AbvGr 730.5469592
Kitchen_AbvGr -608.4626679
Kitchen_QualFair -794.0775241
Kitchen_QualGood 1493.7039581
Kitchen_QualPoor -97.2264777
Kitchen_QualTypical -2528.3486291
TotRms_AbvGrd 2406.5489508
FunctionalMaj2 -357.4390308
FunctionalMin1 -304.9278764
FunctionalMin2 -352.4450124
FunctionalMod -85.8385470
FunctionalSal -279.7994393
FunctionalSev -159.5422694
FunctionalTyp 610.0041908
Fireplaces 2300.6763147
Fireplace_QuFair -135.6926132
Fireplace_QuGood 1768.4144038
Fireplace_QuNo_Fireplace -2347.7719093
Fireplace_QuPoor -325.3321701
Fireplace_QuTypical 774.7168622
Garage_TypeBasment -208.5574909
Garage_TypeBuiltIn 1154.5776635
Garage_TypeCarPort -344.2897666
Garage_TypeDetchd -1758.8989288
Garage_TypeMore_Than_Two_Types -124.6895280
Garage_TypeNo_Garage -1145.5657892
Garage_FinishNo_Garage -1145.5657892
Garage_FinishRFn 738.0433643
Garage_FinishUnf -1989.9707336
Garage_Cars 3169.2163871
Garage_Area 3069.7262496
Garage_QualFair -773.6486669
Garage_QualGood 251.7865262
Garage_QualNo_Garage -1145.5657892
Garage_QualPoor -238.6576009
Garage_QualTypical 1289.0879398
Garage_CondFair -702.3630117
Garage_CondGood 25.5039873
Garage_CondNo_Garage -1145.5657892
Garage_CondPoor -397.4828690
Garage_CondTypical 1391.2522862
Paved_DrivePartial_Pavement -377.7102850
Paved_DrivePaved 1329.4806851
Wood_Deck_SF 1584.3250001
Open_Porch_SF 1422.4367197
Enclosed_Porch -565.7707888
Three_season_porch 200.0853967
Screen_Porch 576.8210990
Pool_Area 150.1700090
Pool_QCFair 0.1025169
Pool_QCGood 94.2727281
Pool_QCNo_Pool -285.0605942
Pool_QCTypical -23.9167970
FenceGood_Wood -473.5143028
FenceMinimum_Privacy -810.5623631
FenceMinimum_Wood_Wire -209.9229815
FenceNo_Fence 942.0871082
Misc_FeatureGar2 -56.4460450
Misc_FeatureNone 275.5386810
Misc_FeatureOthr -81.2845310
Misc_FeatureShed -260.0896309
Misc_Val -63.7221032
Mo_Sold 110.3830413
Year_Sold -182.9407781
Sale_TypeCon 103.6155002
Sale_TypeConLD -248.9987301
Sale_TypeConLI -198.3928891
Sale_TypeConLw -113.2881421
Sale_TypeCWD 61.0896242
Sale_TypeNew 1567.8837282
Sale_TypeOth -195.7197040
Sale_TypeVWD -58.1624595
`Sale_TypeWD ` -875.1734085
Sale_ConditionAdjLand -250.5974073
Sale_ConditionAlloca -114.7880266
Sale_ConditionFamily -300.0729306
Sale_ConditionNormal -479.6536897
Sale_ConditionPartial 1551.7712987
Longitude -1205.3617909
Latitude 1343.9591591
, , 2 comps
.outcome
MS_SubClassOne_Story_1945_and_Older -937.5084623
MS_SubClassOne_Story_with_Finished_Attic_All_Ages -30.0413629
MS_SubClassOne_and_Half_Story_Unfinished_All_Ages -6.7195434
MS_SubClassOne_and_Half_Story_Finished_All_Ages 58.6687276
MS_SubClassTwo_Story_1946_and_Newer 1391.5299269
MS_SubClassTwo_Story_1945_and_Older 693.3851359
MS_SubClassTwo_and_Half_Story_All_Ages 918.9398935
MS_SubClassSplit_or_Multilevel -444.5899808
MS_SubClassSplit_Foyer -351.7886676
MS_SubClassDuplex_All_Styles_and_Ages -527.0501714
MS_SubClassOne_Story_PUD_1946_and_Newer -378.3505978
MS_SubClassTwo_Story_PUD_1946_and_Newer -1443.7044608
MS_SubClassPUD_Multilevel_Split_Level_Foyer -664.1650713
MS_SubClassTwo_Family_conversion_All_Styles_and_Ages -134.4908053
MS_ZoningResidential_High_Density -211.2376436
MS_ZoningResidential_Low_Density 1059.8576439
MS_ZoningResidential_Medium_Density -1027.3965469
MS_ZoningA_agr -294.7912025
MS_ZoningC_all -528.8065962
MS_ZoningI_all -97.9054746
Lot_Frontage 1852.5310876
Lot_Area 2728.3632554
StreetPave 414.8440870
AlleyNo_Alley_Access 207.8607872
AlleyPaved -145.1710598
Lot_ShapeSlightly_Irregular 1452.0326063
Lot_ShapeModerately_Irregular 1021.6388954
Lot_ShapeIrregular -280.8350073
Land_ContourHLS 2262.0773609
Land_ContourLow 236.3595219
Land_ContourLvl -1268.1056955
UtilitiesNoSeWa -265.8186211
UtilitiesNoSewr -169.3797225
Lot_ConfigCulDSac 1403.8898657
Lot_ConfigFR2 -610.5489920
Lot_ConfigFR3 -120.5294800
Lot_ConfigInside -645.8473087
Land_SlopeMod 975.3409123
Land_SlopeSev 352.4435712
NeighborhoodCollege_Creek -1062.7950456
NeighborhoodOld_Town -298.0101183
NeighborhoodEdwards -982.2576861
NeighborhoodSomerset 396.5673477
NeighborhoodNorthridge_Heights 3846.4606562
NeighborhoodGilbert -1558.7332206
NeighborhoodSawyer -838.7687834
NeighborhoodNorthwest_Ames -111.2326058
NeighborhoodSawyer_West -793.3424646
NeighborhoodMitchell -307.2311125
NeighborhoodBrookside -55.2722645
NeighborhoodCrawford 1570.4820874
NeighborhoodIowa_DOT_and_Rail_Road -476.7321490
NeighborhoodTimberland 869.5921893
NeighborhoodNorthridge 2985.9130591
NeighborhoodStone_Brook 2652.3684562
NeighborhoodSouth_and_West_of_Iowa_State_University -154.7470892
NeighborhoodClear_Creek 409.0027312
NeighborhoodMeadow_Village -869.9136958
NeighborhoodBriardale -898.7877495
NeighborhoodBloomington_Heights -810.5661452
NeighborhoodVeenker 469.9263969
NeighborhoodNorthpark_Villa -464.4057180
NeighborhoodBlueste -201.8133106
NeighborhoodGreens 2.6610907
NeighborhoodGreen_Hills 820.7675871
NeighborhoodLandmark -169.8362848
Condition_1Feedr -597.6337836
Condition_1Norm 476.3349653
Condition_1PosA 1165.4848218
Condition_1PosN 406.6725871
Condition_1RRAe -726.7875335
Condition_1RRAn -338.2199032
Condition_1RRNe -202.7952031
Condition_1RRNn -52.4961554
Condition_2Feedr -196.2821043
Condition_2Norm -502.5998866
Condition_2PosA 1957.3735215
Condition_2PosN -313.9108155
Condition_2RRAe -17.5367637
Condition_2RRNn -100.7039022
Bldg_TypeTwoFmCon -123.1550133
Bldg_TypeDuplex -527.0501714
Bldg_TypeTwnhs -1220.9087185
Bldg_TypeTwnhsE -768.4466594
House_StyleOne_and_Half_Unf 23.5392788
House_StyleOne_Story -586.5550195
House_StyleSFoyer -635.8085977
House_StyleSLvl -554.6574828
House_StyleTwo_and_Half_Fin 667.6002950
House_StyleTwo_and_Half_Unf 608.4692571
House_StyleTwo_Story 865.4226125
Overall_QualPoor -472.2657827
Overall_QualFair -689.1264158
Overall_QualBelow_Average -1390.2880791
Overall_QualAverage -1897.6359094
Overall_QualAbove_Average -1605.7740426
Overall_QualGood -680.0255529
Overall_QualVery_Good 2768.1116241
Overall_QualExcellent 4938.0657604
Overall_QualVery_Excellent 4225.6346241
Overall_CondPoor -57.1750202
Overall_CondFair -839.9970346
Overall_CondBelow_Average -778.5136244
Overall_CondAverage 355.5292004
Overall_CondAbove_Average -433.0386811
Overall_CondGood 231.7050719
Overall_CondVery_Good 302.7393721
Overall_CondExcellent 1062.6540334
Year_Built 1375.0231660
Year_Remod_Add 2381.9350040
Roof_StyleGable -2778.0064971
Roof_StyleGambrel 34.3271324
Roof_StyleHip 2889.1721513
Roof_StyleMansard -18.2846344
Roof_StyleShed -10.6303679
Roof_MatlCompShg -681.3873815
Roof_MatlMembran 275.7901319
Roof_MatlMetal -6.0661338
Roof_MatlRoll -85.1099993
`Roof_MatlTar&Grv` -118.2498750
Roof_MatlWdShake 194.9412918
Roof_MatlWdShngl 2060.7527449
Exterior_1stAsphShn 57.7754831
Exterior_1stBrkComm 263.9345826
Exterior_1stBrkFace 1216.8267784
Exterior_1stCemntBd 1382.6710325
Exterior_1stHdBoard -1079.2155048
Exterior_1stImStucc 166.3499242
Exterior_1stMetalSd 207.5271413
Exterior_1stPlywood -470.8971852
Exterior_1stStone 398.7101223
Exterior_1stStucco 184.7585178
Exterior_1stVinylSd 130.2638312
`Exterior_1stWd Sdng` -141.6654164
Exterior_1stWdShing -243.3292184
Exterior_2ndAsphShn 6.1340494
`Exterior_2ndBrk Cmn` -352.4916266
Exterior_2ndBrkFace 593.0158318
Exterior_2ndCBlock -92.9575598
Exterior_2ndCmentBd 1316.9006442
Exterior_2ndHdBoard -956.5004725
Exterior_2ndImStucc 499.7481977
Exterior_2ndMetalSd 292.7391745
Exterior_2ndPlywood -565.6376055
Exterior_2ndStone 131.1507162
Exterior_2ndStucco 198.9176678
Exterior_2ndVinylSd 136.3124285
`Exterior_2ndWd Sdng` 147.0411807
`Exterior_2ndWd Shng` -199.6243533
Mas_Vnr_TypeBrkFace 1018.1582839
Mas_Vnr_TypeCBlock -334.7483063
Mas_Vnr_TypeNone -1899.8887751
Mas_Vnr_TypeStone 1812.8927133
Mas_Vnr_Area 4299.8060740
Exter_QualFair -437.4726143
Exter_QualGood 1126.3266949
Exter_QualTypical -2955.1780062
Exter_CondFair -350.0871462
Exter_CondGood 405.7172221
Exter_CondPoor -294.7912025
Exter_CondTypical -345.7233089
FoundationCBlock -1880.9751925
FoundationPConc 1907.0215926
FoundationSlab -228.7681918
FoundationStone 174.6766618
FoundationWood -184.2149991
Bsmt_QualFair -496.3993270
Bsmt_QualGood -1425.7384636
Bsmt_QualNo_Basement -327.7323268
Bsmt_QualPoor -41.4259320
Bsmt_QualTypical -1698.5203396
Bsmt_CondFair -370.6089641
Bsmt_CondGood 736.0839704
Bsmt_CondNo_Basement -327.7323268
Bsmt_CondPoor -4.6643532
Bsmt_CondTypical -88.5177157
Bsmt_ExposureGd 3407.7938718
Bsmt_ExposureMn 126.6423970
Bsmt_ExposureNo -2162.8614901
Bsmt_ExposureNo_Basement -362.8396160
BsmtFin_Type_1BLQ -470.4603921
BsmtFin_Type_1GLQ 2530.1792221
BsmtFin_Type_1LwQ -290.6337007
BsmtFin_Type_1No_Basement -327.7323268
BsmtFin_Type_1Rec -479.2902880
BsmtFin_Type_1Unf -1265.2791120
BsmtFin_SF_1 -1068.3788630
BsmtFin_Type_2BLQ -177.6448014
BsmtFin_Type_2GLQ 635.8449397
BsmtFin_Type_2LwQ -13.9926168
BsmtFin_Type_2No_Basement -327.7323268
BsmtFin_Type_2Rec -146.6467279
BsmtFin_Type_2Unf -96.6643119
BsmtFin_SF_2 642.7251508
Bsmt_Unf_SF 495.0318382
Total_Bsmt_SF 4628.4771101
HeatingGasA -287.9268031
HeatingGasW 549.9063697
HeatingGrav -77.5682996
HeatingOthW 82.5110995
HeatingWall -170.1831344
Heating_QCFair -383.7822776
Heating_QCGood -788.7878590
Heating_QCPoor -199.8365494
Heating_QCTypical -1617.0170810
Central_AirY 608.1071135
ElectricalFuseF -215.0829909
ElectricalFuseP -41.5484701
ElectricalMix -67.3748687
ElectricalSBrkr 473.0723367
ElectricalUnknown -99.8737345
First_Flr_SF 5375.2363277
Second_Flr_SF 2522.9597633
Low_Qual_Fin_SF 339.6611460
Gr_Liv_Area 6251.8653190
Bsmt_Full_Bath 2354.0830990
Bsmt_Half_Bath -195.9230466
Full_Bath 3438.8434722
Half_Bath 1501.9771437
Bedroom_AbvGr 1501.3034132
Kitchen_AbvGr -470.4079083
Kitchen_QualFair -396.4610015
Kitchen_QualGood -64.2706379
Kitchen_QualPoor -3.5249495
Kitchen_QualTypical -2932.4444118
TotRms_AbvGrd 4483.8916289
FunctionalMaj2 -422.6333824
FunctionalMin1 -234.2191337
FunctionalMin2 -123.9664572
FunctionalMod 227.8848749
FunctionalSal -399.2678154
FunctionalSev -398.2479794
FunctionalTyp 405.2722231
Fireplaces 3954.5427922
Fireplace_QuFair -339.9747727
Fireplace_QuGood 2875.3520265
Fireplace_QuNo_Fireplace -3471.8034609
Fireplace_QuPoor -491.4246734
Fireplace_QuTypical 537.5658114
Garage_TypeBasment -467.1285565
Garage_TypeBuiltIn 1334.8220224
Garage_TypeCarPort -588.4703468
Garage_TypeDetchd -1317.4474641
Garage_TypeMore_Than_Two_Types -124.6513666
Garage_TypeNo_Garage -482.4973970
Garage_FinishNo_Garage -482.4973970
Garage_FinishRFn -724.2405724
Garage_FinishUnf -1565.3043964
Garage_Cars 3929.4357551
Garage_Area 4123.0878453
Garage_QualFair -125.5564243
Garage_QualGood 734.4198324
Garage_QualNo_Garage -482.4973970
Garage_QualPoor -54.7368882
Garage_QualTypical 141.4588905
Garage_CondFair -428.1829039
Garage_CondGood 261.9521529
Garage_CondNo_Garage -482.4973970
Garage_CondPoor -189.6901765
Garage_CondTypical 609.9526078
Paved_DrivePartial_Pavement 18.5792425
Paved_DrivePaved 505.5649376
Wood_Deck_SF 2482.3265048
Open_Porch_SF 1776.7662370
Enclosed_Porch 461.1671179
Three_season_porch 461.0639096
Screen_Porch 1700.1969051
Pool_Area 210.3403281
Pool_QCFair -16.2242372
Pool_QCGood -274.3085284
Pool_QCNo_Pool -662.6390024
Pool_QCTypical 108.9413801
FenceGood_Wood -280.4913986
FenceMinimum_Privacy -434.7606050
FenceMinimum_Wood_Wire -157.1656058
FenceNo_Fence 391.3298297
Misc_FeatureGar2 -0.3404585
Misc_FeatureNone 352.9078926
Misc_FeatureOthr 51.4330178
Misc_FeatureShed -253.3382609
Misc_Val -571.7104741
Mo_Sold 18.7040377
Year_Sold -313.8377364
Sale_TypeCon 276.0887625
Sale_TypeConLD -112.8348366
Sale_TypeConLI -351.0251116
Sale_TypeConLw -240.9182437
Sale_TypeCWD 198.4752139
Sale_TypeNew 1570.8178157
Sale_TypeOth -249.0030254
Sale_TypeVWD -129.0249487
`Sale_TypeWD ` -673.1532978
Sale_ConditionAdjLand -24.1377113
Sale_ConditionAlloca 125.8615142
Sale_ConditionFamily -799.0015021
Sale_ConditionNormal -113.5054866
Sale_ConditionPartial 1540.5550499
Longitude -222.3334046
Latitude 1643.8770521
, , 3 comps
.outcome
MS_SubClassOne_Story_1945_and_Older -657.796964
MS_SubClassOne_Story_with_Finished_Attic_All_Ages 82.528781
MS_SubClassOne_and_Half_Story_Unfinished_All_Ages 491.173805
MS_SubClassOne_and_Half_Story_Finished_All_Ages 682.645158
MS_SubClassTwo_Story_1946_and_Newer 1112.874597
MS_SubClassTwo_Story_1945_and_Older 970.915842
MS_SubClassTwo_and_Half_Story_All_Ages 920.410705
MS_SubClassSplit_or_Multilevel -715.740201
MS_SubClassSplit_Foyer 66.512272
MS_SubClassDuplex_All_Styles_and_Ages -1495.060295
MS_SubClassOne_Story_PUD_1946_and_Newer -958.436938
MS_SubClassTwo_Story_PUD_1946_and_Newer -1050.978655
MS_SubClassPUD_Multilevel_Split_Level_Foyer -478.567827
MS_SubClassTwo_Family_conversion_All_Styles_and_Ages -798.353525
MS_ZoningResidential_High_Density -23.860477
MS_ZoningResidential_Low_Density 501.113266
MS_ZoningResidential_Medium_Density -747.372334
MS_ZoningA_agr -688.270405
MS_ZoningC_all -1071.912608
MS_ZoningI_all -317.415054
Lot_Frontage 1147.821858
Lot_Area 2391.046317
StreetPave 624.278243
AlleyNo_Alley_Access -190.981496
AlleyPaved 293.543077
Lot_ShapeSlightly_Irregular 1199.690086
Lot_ShapeModerately_Irregular 1793.565109
Lot_ShapeIrregular -1876.493191
Land_ContourHLS 2532.979091
Land_ContourLow -402.456962
Land_ContourLvl -73.592231
UtilitiesNoSeWa -699.443994
UtilitiesNoSewr -445.464859
Lot_ConfigCulDSac 2116.789179
Lot_ConfigFR2 -917.876926
Lot_ConfigFR3 -164.396989
Lot_ConfigInside -491.763924
Land_SlopeMod 801.635304
Land_SlopeSev -348.963023
NeighborhoodCollege_Creek -1258.801689
NeighborhoodOld_Town -579.115132
NeighborhoodEdwards -2070.874822
NeighborhoodSomerset 1211.351600
NeighborhoodNorthridge_Heights 5324.777666
NeighborhoodGilbert -2943.703372
NeighborhoodSawyer -610.801946
NeighborhoodNorthwest_Ames -643.280950
NeighborhoodSawyer_West -1019.144965
NeighborhoodMitchell -25.463573
NeighborhoodBrookside 610.155991
NeighborhoodCrawford 2179.463438
NeighborhoodIowa_DOT_and_Rail_Road -332.213969
NeighborhoodTimberland 729.944911
NeighborhoodNorthridge 4622.757564
NeighborhoodStone_Brook 3997.307912
NeighborhoodSouth_and_West_of_Iowa_State_University -526.330084
NeighborhoodClear_Creek -226.914738
NeighborhoodMeadow_Village -879.819381
NeighborhoodBriardale -407.597015
NeighborhoodBloomington_Heights -1509.275134
NeighborhoodVeenker 268.445939
NeighborhoodNorthpark_Villa 242.220504
NeighborhoodBlueste 48.499637
NeighborhoodGreens 64.792598
NeighborhoodGreen_Hills 2293.412259
NeighborhoodLandmark -23.921148
Condition_1Feedr -1487.119074
Condition_1Norm 1625.945801
Condition_1PosA 1338.376242
Condition_1PosN 36.586032
Condition_1RRAe -909.653133
Condition_1RRAn -475.533806
Condition_1RRNe -203.700663
Condition_1RRNn -287.961705
Condition_2Feedr -309.678856
Condition_2Norm -5.161076
Condition_2PosA 2653.239441
Condition_2PosN -2391.394938
Condition_2RRAe -208.166023
Condition_2RRNn -86.675435
Bldg_TypeTwoFmCon -713.169416
Bldg_TypeDuplex -1495.060295
Bldg_TypeTwnhs -804.086530
Bldg_TypeTwnhsE -1217.535509
House_StyleOne_and_Half_Unf 474.141536
House_StyleOne_Story -880.808936
House_StyleSFoyer -248.868701
House_StyleSLvl -864.783668
House_StyleTwo_and_Half_Fin 356.546580
House_StyleTwo_and_Half_Unf 243.726159
House_StyleTwo_Story 830.022614
Overall_QualPoor -644.607044
Overall_QualFair -1314.271580
Overall_QualBelow_Average -1743.027325
Overall_QualAverage -2078.702423
Overall_QualAbove_Average -2323.397222
Overall_QualGood -1401.473524
Overall_QualVery_Good 3276.151475
Overall_QualExcellent 7733.684825
Overall_QualVery_Excellent 5799.258458
Overall_CondPoor -388.030850
Overall_CondFair -1864.585240
Overall_CondBelow_Average -1654.276013
Overall_CondAverage -651.388763
Overall_CondAbove_Average 129.848537
Overall_CondGood 1281.681862
Overall_CondVery_Good 1252.375744
Overall_CondExcellent 1758.248550
Year_Built 1729.692272
Year_Remod_Add 3225.434315
Roof_StyleGable -1686.407216
Roof_StyleGambrel -214.879154
Roof_StyleHip 2125.533491
Roof_StyleMansard -682.803331
Roof_StyleShed -486.478325
Roof_MatlCompShg 311.016224
Roof_MatlMembran 303.226629
Roof_MatlMetal -118.035740
Roof_MatlRoll -293.364252
`Roof_MatlTar&Grv` -898.567054
Roof_MatlWdShake -357.032572
Roof_MatlWdShngl 3594.716287
Exterior_1stAsphShn 68.133082
Exterior_1stBrkComm 399.183204
Exterior_1stBrkFace 2172.416320
Exterior_1stCemntBd 1470.601559
Exterior_1stHdBoard -1055.521249
Exterior_1stImStucc 188.914784
Exterior_1stMetalSd 768.372180
Exterior_1stPlywood -616.407816
Exterior_1stStone 454.074801
Exterior_1stStucco -251.649479
Exterior_1stVinylSd 79.746652
`Exterior_1stWd Sdng` -652.058240
Exterior_1stWdShing -517.579871
Exterior_2ndAsphShn -129.251921
`Exterior_2ndBrk Cmn` 227.641748
Exterior_2ndBrkFace 711.738722
Exterior_2ndCBlock -54.799204
Exterior_2ndCmentBd 1321.777865
Exterior_2ndHdBoard -929.256483
Exterior_2ndImStucc 584.716346
Exterior_2ndMetalSd 893.459275
Exterior_2ndPlywood -1044.292122
Exterior_2ndStone -12.414018
Exterior_2ndStucco -267.396783
Exterior_2ndVinylSd 82.782586
`Exterior_2ndWd Sdng` 142.092383
`Exterior_2ndWd Shng` -280.643635
Mas_Vnr_TypeBrkFace 113.552546
Mas_Vnr_TypeCBlock -777.728187
Mas_Vnr_TypeNone -525.976818
Mas_Vnr_TypeStone 1140.766269
Mas_Vnr_Area 4733.015528
Exter_QualFair -950.415264
Exter_QualGood 606.675198
Exter_QualTypical -3149.629666
Exter_CondFair -984.968559
Exter_CondGood 781.652610
Exter_CondPoor -688.270405
Exter_CondTypical -478.161088
FoundationCBlock -2073.685795
FoundationPConc 2042.730770
FoundationSlab -437.859627
FoundationStone 211.012761
FoundationWood -190.152058
Bsmt_QualFair -554.284997
Bsmt_QualGood -2489.323388
Bsmt_QualNo_Basement -557.532422
Bsmt_QualPoor -12.616472
Bsmt_QualTypical -2129.585212
Bsmt_CondFair -888.329079
Bsmt_CondGood 846.004002
Bsmt_CondNo_Basement -557.532422
Bsmt_CondPoor -98.957799
Bsmt_CondTypical 269.184501
Bsmt_ExposureGd 3996.881645
Bsmt_ExposureMn -405.151651
Bsmt_ExposureNo -2265.857965
Bsmt_ExposureNo_Basement -654.049233
BsmtFin_Type_1BLQ -433.220031
BsmtFin_Type_1GLQ 3567.341829
BsmtFin_Type_1LwQ -719.687522
BsmtFin_Type_1No_Basement -557.532422
BsmtFin_Type_1Rec -417.765345
BsmtFin_Type_1Unf -2295.316721
BsmtFin_SF_1 -2050.556799
BsmtFin_Type_2BLQ -341.883088
BsmtFin_Type_2GLQ 1159.182511
BsmtFin_Type_2LwQ 255.457374
BsmtFin_Type_2No_Basement -557.532422
BsmtFin_Type_2Rec -556.430081
BsmtFin_Type_2Unf -221.999953
BsmtFin_SF_2 859.201813
Bsmt_Unf_SF -727.164210
Total_Bsmt_SF 4788.601393
HeatingGasA -47.741657
HeatingGasW 523.270354
HeatingGrav -244.693314
HeatingOthW -52.497437
HeatingWall -428.174229
Heating_QCFair -886.098701
Heating_QCGood -789.470919
Heating_QCPoor -384.847003
Heating_QCTypical -1832.892776
Central_AirY 1125.267022
ElectricalFuseF -316.087274
ElectricalFuseP 84.449800
ElectricalMix -81.098716
ElectricalSBrkr 337.001466
ElectricalUnknown 40.764980
First_Flr_SF 5483.803955
Second_Flr_SF 3235.173739
Low_Qual_Fin_SF 115.102795
Gr_Liv_Area 6909.265891
Bsmt_Full_Bath 3219.538150
Bsmt_Half_Bath -685.011810
Full_Bath 3736.390529
Half_Bath 1941.854086
Bedroom_AbvGr 798.786696
Kitchen_AbvGr -1813.360813
Kitchen_QualFair -757.519086
Kitchen_QualGood -1080.902122
Kitchen_QualPoor 63.062670
Kitchen_QualTypical -3544.458402
TotRms_AbvGrd 4046.764882
FunctionalMaj2 -991.639412
FunctionalMin1 -905.763288
FunctionalMin2 -431.307927
FunctionalMod -84.831386
FunctionalSal -956.370313
FunctionalSev -1175.971852
FunctionalTyp 1680.422662
Fireplaces 3381.636035
Fireplace_QuFair -577.900829
Fireplace_QuGood 1606.286689
Fireplace_QuNo_Fireplace -2478.311050
Fireplace_QuPoor -664.251904
Fireplace_QuTypical 66.988573
Garage_TypeBasment -1213.373562
Garage_TypeBuiltIn 1299.094121
Garage_TypeCarPort -989.977836
Garage_TypeDetchd -855.314111
Garage_TypeMore_Than_Two_Types -515.080776
Garage_TypeNo_Garage 39.159237
Garage_FinishNo_Garage 39.159237
Garage_FinishRFn -2013.330521
Garage_FinishUnf -1053.070232
Garage_Cars 4215.932393
Garage_Area 4360.960895
Garage_QualFair 32.862865
Garage_QualGood 963.833180
Garage_QualNo_Garage 39.159237
Garage_QualPoor -78.966970
Garage_QualTypical -496.013722
Garage_CondFair -878.639064
Garage_CondGood 236.502435
Garage_CondNo_Garage 39.159237
Garage_CondPoor -431.919200
Garage_CondTypical 507.443021
Paved_DrivePartial_Pavement -57.705945
Paved_DrivePaved 648.389813
Wood_Deck_SF 2883.560319
Open_Porch_SF 845.549475
Enclosed_Porch 658.823481
Three_season_porch 748.822559
Screen_Porch 2373.448110
Pool_Area -733.178528
Pool_QCFair -121.922959
Pool_QCGood -2150.199715
Pool_QCNo_Pool -181.142866
Pool_QCTypical -34.717025
FenceGood_Wood 209.763191
FenceMinimum_Privacy 202.637323
FenceMinimum_Wood_Wire 42.615237
FenceNo_Fence -452.719485
Misc_FeatureGar2 199.128314
Misc_FeatureNone 557.781148
Misc_FeatureOthr 277.197775
Misc_FeatureShed -91.339255
Misc_Val -2399.813910
Mo_Sold -512.204913
Year_Sold -425.866531
Sale_TypeCon 711.772879
Sale_TypeConLD -63.155014
Sale_TypeConLI -500.500845
Sale_TypeConLw -469.475064
Sale_TypeCWD 365.660799
Sale_TypeNew 1091.947209
Sale_TypeOth -107.779942
Sale_TypeVWD -321.043158
`Sale_TypeWD ` 165.002983
Sale_ConditionAdjLand 70.855584
Sale_ConditionAlloca 216.022157
Sale_ConditionFamily -1533.348043
Sale_ConditionNormal 1289.544776
Sale_ConditionPartial 1005.554932
Longitude -344.696083
Latitude 2225.813451
Approximate the relationship between a binary response variable and a set of predictor variables
Libraries
Code for the data, from previous chapters
# attrition <- rsample::attrition # line in book chp1 no longer works
# data are moved into the `modeldata` package
df <- modeldata::attrition %>%
# make all factors unordered
mutate_if(is.ordered, factor, ordered = FALSE)
set.seed(123) # for reproducibility
churn_split <- initial_split(df, prop = .7, strata = "Attrition")
churn_train <- training(churn_split)
churn_test <- testing(churn_split)
The formula of a sigmoid function looks complicated:
\[ p(X) = \frac {e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}} \]
Look at the odds:
\[ \frac {p(X)} {1-p(X)} = \frac {e^{\beta_0+\beta_1X}}{1+e^{\beta_0+\beta_1X}} / \frac {1}{1+e^{\beta_0+\beta_1X}} = e^{\beta_0+\beta_1X} \]
Then take the log and call the result the logit (the log of the odds), which is linear in X:
\[ \log \left( \frac {p(X)} {1-p(X)}\right) = \log \left(e^{\beta_0+\beta_1X} \right) = \beta_0+\beta_1X \]
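The three formulas above can be checked numerically. A minimal sketch in base R; the values of `beta0`, `beta1` and `x` are made up for illustration and do not come from the attrition data:

```r
# Hypothetical coefficients and predictor values, for illustration only
beta0 <- -1
beta1 <- 0.5
x <- seq(-4, 4, by = 2)

eta <- beta0 + beta1 * x          # linear predictor
p <- exp(eta) / (1 + exp(eta))    # sigmoid: p(X)
odds <- p / (1 - p)               # odds
logit <- log(odds)                # log of the odds

all.equal(odds, exp(eta))         # odds equal e^(beta0 + beta1 * X)
all.equal(logit, eta)             # logit is linear in X
all.equal(p, plogis(eta))         # base R's plogis() is the same sigmoid
```

`plogis()` and its inverse `qlogis()` are in the stats package, so the sigmoid does not have to be written out by hand.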
Models are estimated by maximum likelihood
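To see what maximum likelihood means here, a rough sketch on simulated data (not the attrition data): it minimizes the negative Bernoulli log-likelihood with `optim()` and compares the result with `glm()`. The simulated coefficients (-0.5 and 1.2) and the helper `neg_log_lik()` are assumptions made for the illustration; `glm()` itself uses a different algorithm (iteratively reweighted least squares) to reach the same maximum.

```r
set.seed(123)
n <- 500
x <- rnorm(n)
# Simulate a binary response from a known logistic model
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

# Negative log-likelihood of a logistic regression with one predictor
neg_log_lik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - log1p(exp(eta)))
}

fit_optim <- optim(par = c(0, 0), fn = neg_log_lik, method = "BFGS")
fit_glm <- glm(y ~ x, family = "binomial")

# Both approaches should give (nearly) the same estimates
rbind(optim = fit_optim$par, glm = unname(coef(fit_glm)))
```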
| term | estimate |
|---|---|
| (Intercept) | -0.8860896 |
| MonthlyIncome | -0.0001386 |
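An increase of 1 unit in MonthlyIncome changes the log-odds of attrition by -0.0001386, which multiplies the odds by exp(-0.0001386) ≈ 0.99986.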
# A tibble: 2 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -0.886 0.157 -5.64 0.0000000174
2 MonthlyIncome -0.000139 0.0000272 -5.10 0.000000344
2.5 % 97.5 %
(Intercept) -1.1932606571 -5.761048e-01
MonthlyIncome -0.0001948723 -8.803311e-05
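Coefficients and confidence limits are on the log-odds scale; exponentiating them gives odds ratios. A small sketch that re-types the numbers printed above (with a fitted model object one would call `exp(coef(...))` and `exp(confint(...))` on it instead; the 1000-unit rescaling is an added illustration, not from the book):

```r
# Estimates and 95% confidence limits copied from the output above
coefs <- c("(Intercept)" = -0.8860896, MonthlyIncome = -0.0001386)
ci <- rbind(
  "(Intercept)" = c(lower = -1.1932606571, upper = -0.5761048),
  MonthlyIncome = c(lower = -0.0001948723, upper = -0.00008803311)
)

odds_ratios <- exp(coefs)   # multiplicative change in odds per 1 unit
odds_ratio_ci <- exp(ci)    # confidence limits on the odds-ratio scale

# One unit of income is tiny; per 1000 units the effect is easier to read
or_per_1000 <- exp(1000 * coefs[["MonthlyIncome"]])

odds_ratios
odds_ratio_ci
or_per_1000                 # about 0.87: odds drop by roughly 13% per 1000 units
```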
Explaining attrition from MonthlyIncome and OverTime:
model3 <- glm(
Attrition ~ MonthlyIncome + OverTime,
family = "binomial",
data = churn_train
)
broom::tidy(model3)
# A tibble: 3 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -1.33 0.177 -7.54 4.74e-14
2 MonthlyIncome -0.000147 0.0000280 -5.27 1.38e- 7
3 OverTimeYes 1.35 0.180 7.50 6.59e-14
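The estimates above can be turned into a predicted probability by hand. A sketch using the rounded coefficients from the table; the helper `predict_attrition()` and the income of 5000 are made up for the example:

```r
# Rounded estimates copied from the model3 output above
b <- c(intercept = -1.33, income = -0.000147, overtime_yes = 1.35)

# Hypothetical helper: predicted probability of attrition
predict_attrition <- function(monthly_income, overtime) {
  eta <- b[["intercept"]] +
    b[["income"]] * monthly_income +
    b[["overtime_yes"]] * (overtime == "Yes")
  plogis(eta)  # sigmoid of the linear predictor
}

# Same income, without and with overtime
p <- predict_attrition(5000, c("No", "Yes"))
p
```

With the fitted model, `predict(model3, newdata = ..., type = "response")` does the same calculation with unrounded coefficients.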
# different from book:
# adds column "pred" with the predicted probabilities from model3
# and column "prob" with the observed outcome coded as 0/1
churn_train3 <-
modelr::add_predictions(churn_train, model = model3, type = "response") %>%
mutate(prob = ifelse(Attrition == "Yes", 1, 0))
# also different from book
ggplot(churn_train3,
aes(x = MonthlyIncome, color = OverTime)) +
geom_point(aes(y = prob), alpha = .15) + # observations
geom_point(aes(y = pred)) + # predictions
labs(title = "Predicted probabilities for model3",
x = "Monthly Income",
y = "Probability of Attrition")
Attrition ~ MonthlyIncome
set.seed(123)
cv_model1 <- train(
Attrition ~ MonthlyIncome,
data = churn_train,
method = "glm",
family = "binomial",
trControl = trainControl(method = "cv",
number = 10))
pred_class1 <- predict(cv_model1,
churn_train)
confusionMatrix(
data = relevel(pred_class1,
ref = "Yes"),
reference =
relevel(churn_train$Attrition,
ref = "Yes")
)
Attrition ~ .
set.seed(123)
cv_model3 <- train(
Attrition ~ .,
data = churn_train,
method = "glm",
family = "binomial",
trControl = trainControl(method = "cv",
number = 10))
pred_class3 <- predict(cv_model3,
churn_train)
confusionMatrix(
data = relevel(pred_class3,
ref = "Yes"),
reference =
relevel(churn_train$Attrition,
ref = "Yes")
)
Attrition ~ MonthlyIncome
Confusion Matrix and Statistics
Reference
Prediction Yes No
Yes 0 0
No 165 863
Accuracy : 0.8395
95% CI : (0.8156, 0.8614)
No Information Rate : 0.8395
P-Value [Acc > NIR] : 0.5208
Kappa : 0
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.0000
Specificity : 1.0000
Pos Pred Value : NaN
Neg Pred Value : 0.8395
Prevalence : 0.1605
Detection Rate : 0.0000
Detection Prevalence : 0.0000
Balanced Accuracy : 0.5000
'Positive' Class : Yes
Attrition ~ .
Confusion Matrix and Statistics
Reference
Prediction Yes No
Yes 83 20
No 82 843
Accuracy : 0.9008
95% CI : (0.8809, 0.9184)
No Information Rate : 0.8395
P-Value [Acc > NIR] : 8.982e-09
Kappa : 0.5658
Mcnemar's Test P-Value : 1.542e-09
Sensitivity : 0.50303
Specificity : 0.97683
Pos Pred Value : 0.80583
Neg Pred Value : 0.91135
Prevalence : 0.16051
Detection Rate : 0.08074
Detection Prevalence : 0.10019
Balanced Accuracy : 0.73993
'Positive' Class : Yes
No Information Rate: 0.8395. Predicting the most common outcome (“No”) for everyone already gives an accuracy of about 84%; the Attrition ~ MonthlyIncome model does exactly that.
Accuracy: P(pred = actual), (TP+TN)/(TP+FP+TN+FN)
Sensitivity (recall): P(pred = “yes”| actual = “yes”), TP / (TP + FN)
Specificity: P(pred = “no”| actual = “no”), TN / (TN + FP)
Pos Pred Value (precision): P(actual = “yes”| pred = “yes”), TP / (TP + FP)
Neg Pred Value: P(actual = “no”| pred = “no”), TN / (TN + FN)
Prevalence: P(actual = “yes”), (TP+FN)/(TP+FN+FP+TN)
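As a check, the definitions can be applied to the four cell counts of the Attrition ~ . confusion matrix shown earlier (positive class “Yes”). A short sketch in base R; the variable names are chosen for the example:

```r
# Cell counts from the Attrition ~ . confusion matrix
tp <- 83   # predicted Yes, actual Yes
fp <- 20   # predicted Yes, actual No
fn <- 82   # predicted No,  actual Yes
tn <- 843  # predicted No,  actual No
n <- tp + fp + fn + tn

metrics <- c(
  accuracy       = (tp + tn) / n,
  sensitivity    = tp / (tp + fn),
  specificity    = tn / (tn + fp),
  pos_pred_value = tp / (tp + fp),
  neg_pred_value = tn / (tn + fn),
  prevalence     = (tp + fn) / n
)
# Balanced accuracy is the mean of sensitivity and specificity
metrics[["balanced_accuracy"]] <-
  (metrics[["sensitivity"]] + metrics[["specificity"]]) / 2

# No information rate: share of the most common actual class
no_information_rate <- max(tp + fn, fp + tn) / n

round(metrics, 4)
round(no_information_rate, 4)
```

The rounded values agree with the `confusionMatrix()` output for this model.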
library(ROCR)
m1_prob <- predict(cv_model1,
churn_train, type = "prob")$Yes
m3_prob <- predict(cv_model3,
churn_train, type = "prob")$Yes
# Compute ROC curves (TPR vs. FPR) for both models
perf1 <- prediction(m1_prob,
churn_train$Attrition) %>%
performance(measure = "tpr",
x.measure = "fpr")
perf2 <- prediction(m3_prob,
churn_train$Attrition) %>%
performance(measure = "tpr",
x.measure = "fpr")
plot(perf1, col = "black", lty = 2)
plot(perf2, add = TRUE, col = "blue")
legend(0.8, 0.2, legend = c("cv_model1", "cv_model3"),
col = c("black", "blue"), lty = 2:1, cex = 0.6)
Other options for ROC curves:
https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/
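A ROC curve is often summarized by the area under it (AUC). A minimal base R sketch, assuming the rank-based (Mann-Whitney) formulation rather than ROCR; the helper `auc_rank()` and the toy data are made up for the illustration:

```r
# Hypothetical helper: AUC as the probability that a randomly chosen
# positive case gets a higher predicted probability than a negative one
auc_rank <- function(prob, actual, positive = "Yes") {
  is_pos <- actual == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(prob)  # ties get average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example: 3 positives and 3 negatives
prob <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
actual <- c("Yes", "Yes", "Yes", "No", "No", "No")
auc_toy <- auc_rank(prob, actual)
auc_toy  # 8 of the 9 positive-negative pairs are ordered correctly

# A model that gives everyone the same probability scores 0.5
auc_constant <- auc_rank(rep(0.5, 6), actual)
```

For the models above one could call, for example, `auc_rank(m3_prob, churn_train$Attrition)`. ROCR can also report this value via `performance(..., measure = "auc")`.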
R-Ladies theme for Quarto Presentations. Code available on GitHub.